{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## A different approach\n",
    "The steps in doing data cleanup with some examples have been done in great detail in the Linear Regression  Data Cleanup.\n",
    "\n",
    "In this dataset most of the cleanup is in the strings in the columns names.\n",
    "Other than that there is nothing of major significance.\n",
    "Changing column names is something we attempt here as well so the lessons will be useful.\n",
    "\n",
    "In this exercise we are going to do something quite different, something more daring and more educational than the traditional data cleanup.  We are going to avoid the process of cleaning up the column names entirely.  We are just going to rename them x0,x1,...\n",
    "all the way to x516, keeping x562 and 563 as 'subject' and 'activity'.  \n",
    "\n",
    "Then we are going to use a black box approach to Random Forests, i.e. one where we dont really understand the variables and the model but we can do useful things anyway. Then we will see what we get as the top 10 variables and compare with the previous approach.\n",
    "\n",
    "This will allow use to compare and contrast the two methods - one where we used a lot of domain knowledge, the other this one, where we use a 'black box' approach.\n",
    "\n",
    "In practice we prefer the former approach but it is useful to see the latter in action as well."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Populating the interactive namespace from numpy and matplotlib\n"
     ]
    }
   ],
   "source": [
    "%pylab inline\n",
    "import pandas as pd\n",
    "from numpy import nan as NA\n",
    "samsungdata = pd.read_csv('../datasets/samsung/samsungdata.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "564"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "len(samsungdata.columns)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index([subject, activity], dtype=object)"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "samsungdata.columns[-2:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "sambak = samsungdata # make a copy and experiment on it not the original"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Unnamed: 0'"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sambak.columns[0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": []
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "cols=list(sambak.columns) # get the columns names and coerce to a mutable list. A pandas index is immutable."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "cols[0] = 'Renamed'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['Renamed',\n",
       " 'tBodyAcc-mean()-X',\n",
       " 'tBodyAcc-mean()-Y',\n",
       " 'tBodyAcc-mean()-Z',\n",
       " 'tBodyAcc-std()-X']"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "cols[0:5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "newcols = [ \"x%d\"%(k) for k in range(0,len(cols)) ] # make a list with a new set of column names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['x0', 'x1', 'x2', 'x3', 'x4']"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "newcols[0:5]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['x562', 'x563']"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "newcols[-2:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'x563'"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "newcols[-1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "newcols[-2:] = cols[-2:] # replace the last two items with the human readable column names from the original dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['subject', 'activity']"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "newcols[-2:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "sambak.columns = newcols # replace the orig columns with newcols"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'x0'"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sambak.columns[0] # check the name"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0      1\n",
       "1      2\n",
       "2      3\n",
       "3      4\n",
       "4      5\n",
       "5      6\n",
       "6      7\n",
       "7      8\n",
       "8      9\n",
       "9     10\n",
       "10    11\n",
       "11    12\n",
       "12    13\n",
       "13    14\n",
       "14    15\n",
       "...\n",
       "7337    7338\n",
       "7338    7339\n",
       "7339    7340\n",
       "7340    7341\n",
       "7341    7342\n",
       "7342    7343\n",
       "7343    7344\n",
       "7344    7345\n",
       "7345    7346\n",
       "7346    7347\n",
       "7347    7348\n",
       "7348    7349\n",
       "7349    7350\n",
       "7350    7351\n",
       "7351    7352\n",
       "Name: x0, Length: 7352, dtype: int64"
      ]
     },
     "execution_count": 39,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sambak['x0']  # check the value - this is the row index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 49,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "sambak2=sambak[sambak.columns[1:-2]] # drop the first column and the last two columns to get the independent vars"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0     0.288585\n",
       "1     0.278419\n",
       "2     0.279653\n",
       "3     0.279174\n",
       "4     0.276629\n",
       "5     0.277199\n",
       "6     0.279454\n",
       "7     0.277432\n",
       "8     0.277293\n",
       "9     0.280586\n",
       "10    0.276880\n",
       "11    0.276228\n",
       "12    0.278457\n",
       "13    0.277175\n",
       "14    0.297946\n",
       "...\n",
       "7337    0.278414\n",
       "7338    0.344757\n",
       "7339    0.326647\n",
       "7340    0.223283\n",
       "7341    0.363768\n",
       "7342    0.276137\n",
       "7343    0.294230\n",
       "7344    0.221206\n",
       "7345    0.207861\n",
       "7346    0.237966\n",
       "7347    0.299665\n",
       "7348    0.273853\n",
       "7349    0.273387\n",
       "7350    0.289654\n",
       "7351    0.351503\n",
       "Name: x1, Length: 7352, dtype: float64"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sambak2['x1'] # check to see it's there"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Try this:\n",
    "\n",
    "sambak2['x0']  # we want this to fail and give an exception like\n",
    "\n",
    "---------------------------------------------------------------------------\n",
    "KeyError                                  Traceback (most recent call last)\n",
    "<ipython-input-52-a199ccf2be41> in <module>()\n",
    "----> 1 sambak2['x0']  # check\n",
    "\n",
    "/Applications/anaconda15/lib/python2.7/site-packages/pandas/core/frame.pyc in __getitem__(self, key)\n",
    "   1926         else:\n",
    "   1927             # get column\n",
    "-> 1928             return self._get_item_cache(key)\n",
    "   1929 \n",
    "   1930     def _getitem_slice(self, key):\n",
    "\n",
    "/Applications/anaconda15/lib/python2.7/site-packages/pandas/core/generic.pyc in _get_item_cache(self, item)\n",
    "    568             return cache[item]\n",
    "    569         except Exception:\n",
    "--> 570             values = self._data.get(item)\n",
    "    571             res = self._box_item_values(item, values)\n",
    "    572             cache[item] = res\n",
    "\n",
    "/Applications/anaconda15/lib/python2.7/site-packages/pandas/core/internals.pyc in get(self, item)\n",
    "   1381 \n",
    "   1382     def get(self, item):\n",
    "-> 1383         _, block = self._find_block(item)\n",
    "   1384         return block.get(item)\n",
    "   1385 \n",
    "\n",
    "/Applications/anaconda15/lib/python2.7/site-packages/pandas/core/internals.pyc in _find_block(self, item)\n",
    "   1523 \n",
    "   1524     def _find_block(self, item):\n",
    "-> 1525         self._check_have(item)\n",
    "   1526         for i, block in enumerate(self.blocks):\n",
    "   1527             if item in block:\n",
    "\n",
    "/Applications/anaconda15/lib/python2.7/site-packages/pandas/core/internals.pyc in _check_have(self, item)\n",
    "   1530     def _check_have(self, item):\n",
    "   1531         if item not in self.items:\n",
    "-> 1532             raise KeyError('no item named %s' % com.pprint_thing(item))\n",
    "   1533 \n",
    "   1534     def reindex_axis(self, new_axis, method=None, axis=0, copy=True):\n",
    "\n",
    "KeyError: u'no item named x0'"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 55,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    -0.058627\n",
       "1    -0.054317\n",
       "2    -0.049118\n",
       "3    -0.047663\n",
       "4    -0.043892\n",
       "5    -0.042126\n",
       "6    -0.043010\n",
       "7    -0.041976\n",
       "8    -0.037364\n",
       "9    -0.034417\n",
       "10   -0.034681\n",
       "11   -0.035852\n",
       "12   -0.035998\n",
       "13   -0.035063\n",
       "14    0.036444\n",
       "...\n",
       "7337    0.043458\n",
       "7338    0.040161\n",
       "7339    0.023146\n",
       "7340    0.014452\n",
       "7341    0.016289\n",
       "7342   -0.005105\n",
       "7343   -0.001647\n",
       "7344    0.009538\n",
       "7345    0.027878\n",
       "7346    0.048907\n",
       "7347    0.049819\n",
       "7348    0.050053\n",
       "7349    0.040811\n",
       "7350    0.025339\n",
       "7351    0.036695\n",
       "Name: x561, Length: 7352, dtype: float64"
      ]
     },
     "execution_count": 55,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sambak2[sambak2.columns[-1]] # want to check what the last columns is"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 56,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "25    409\n",
       "21    408\n",
       "26    392\n",
       "30    383\n",
       "28    382\n",
       "27    376\n",
       "23    372\n",
       "17    368\n",
       "16    366\n",
       "19    360\n",
       "1     347\n",
       "29    344\n",
       "3     341\n",
       "15    328\n",
       "6     325\n",
       "14    323\n",
       "22    321\n",
       "11    316\n",
       "7     308\n",
       "5     302\n",
       "8     281\n",
       "dtype: int64"
      ]
     },
     "execution_count": 56,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "samsungdata['subject'].value_counts()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([ 1,  3,  5,  6,  7,  8, 11, 14, 15, 16, 17, 19, 21, 22, 23, 25, 26,\n",
       "       27, 28, 29, 30])"
      ]
     },
     "execution_count": 57,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "samsungdata['subject'].unique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 59,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "samtest = samsungdata[samsungdata['subject'] >= 27]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 60,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "samtrain = samsungdata[samsungdata['subject'] <= 6]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 66,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "samval2 = samsungdata[samsungdata['subject'] < 27]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "samval = samval2[samsungdata['subject'] >= 21 ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([21, 22, 23, 25, 26])"
      ]
     },
     "execution_count": 69,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "samval['subject'].unique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 70,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 3, 5, 6])"
      ]
     },
     "execution_count": 70,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "samtrain['subject'].unique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 71,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([27, 28, 29, 30])"
      ]
     },
     "execution_count": 71,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "samtest['subject'].unique()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Now we are ready to create the model, validate and test it."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 72,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import randomforests as rf\n",
    "samtrain = rf.remap_col(samtrain,'activity')\n",
    "samval = rf.remap_col(samval,'activity')\n",
    "samtest = rf.remap_col(samtest,'activity')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([3, 2, 1, 4, 6, 5])"
      ]
     },
     "execution_count": 73,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "samtrain['activity'].unique()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 103,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import sklearn.ensemble as sk\n",
    "rfc = sk.RandomForestClassifier(n_estimators=50, compute_importances=True, oob_score=True)\n",
    "train_data = samtrain[samtrain.columns[1:-2]]\n",
    "train_truth = samtrain['activity']\n",
    "model = rfc.fit(train_data, train_truth)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 104,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.98403041825095061"
      ]
     },
     "execution_count": 104,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "rfc.oob_score_"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 120,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[(0.036111205314263831, 'x41'),\n",
       " (0.022944421102129454, 'x49'),\n",
       " (0.036627075843360363, 'x50'),\n",
       " (0.01969226434644121, 'x52'),\n",
       " (0.047155838268418251, 'x53'),\n",
       " (0.020936155077376319, 'x56'),\n",
       " (0.030973069486755728, 'x57'),\n",
       " (0.018972471372629626, 'x344'),\n",
       " (0.017855365021638578, 'x558'),\n",
       " (0.038597146390154873, 'x559')]"
      ]
     },
     "execution_count": 120,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "fi = enumerate(rfc.feature_importances_)\n",
    "cols = samtrain.columns\n",
    "[(value,cols[i]) for (i,value) in fi if value > 0.017]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 124,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "fi = enumerate(rfc.feature_importances_)\n",
    "cols = samtrain.columns\n",
    "top10 = [(cols[i]) for (i,value) in fi if value > 0.017]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 125,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['x41', 'x49', 'x50', 'x52', 'x53', 'x56', 'x57', 'x344', 'x558', 'x559']"
      ]
     },
     "execution_count": 125,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "top10"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So here we see a set of top 10 features but we have absolutely no idea what they are or mean.\n",
    "We came at the 0.17 figure by trial and error, adjusting until we got 10.  The number here is 0.17. But the number in the earlier meathod was 0.04. What does this mean?  \n",
    "Since we are using many more features (all of them actually) the importance meausre is spread more widely, so individual importance scores are smaller.  But we can still rank them and pick the top 10."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 113,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "# pandas data frame adds a spurious unknown column in 0 position hence starting at col 1\n",
    "# not using subject column, activity ie target is in last columns hence -2 i.e dropping last 2 cols\n",
    "\n",
    "val_data = samval[samval.columns[1:-2]]\n",
    "val_truth = samval['activity']\n",
    "val_pred = rfc.predict(val_data)\n",
    "\n",
    "test_data = samtest[samtest.columns[1:-2]]\n",
    "test_truth = samtest['activity']\n",
    "test_pred = rfc.predict(test_data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 114,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "mean accuracy score for validation set = 0.860673\n",
      "mean accuracy score for test set = 0.919192\n"
     ]
    }
   ],
   "source": [
    "print(\"mean accuracy score for validation set = %f\" %(rfc.score(val_data, val_truth)))\n",
    "print(\"mean accuracy score for test set = %f\" %(rfc.score(test_data, test_truth)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 115,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "import sklearn.metrics as skm\n",
    "test_cm = skm.confusion_matrix(test_truth,test_pred)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 116,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAPYAAADzCAYAAAC13+t7AAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJzt3X1UVHX+B/D3AEPIg8iDDggkIBKMosyKoK4krgJLimma\nikocldrK9oS6bVaWD6WYZaZ5bNMtt0fC2jX8mbDaBki2mxlirWSYMgaIKA+iCDIw8/39QdxlkGEu\nd54ul8/rnHsOM3Pnfj+Xmc/c7/3eez9XxhhjIIRIip2tAyCEmB8lNiESRIlNiARRYhMiQZTYhEgQ\nJTYhEiTqxG5paUFycjKGDBmChQsXCl7Ohx9+iMTERDNGZjtFRUUICwsT9N6ffvoJkZGRGDx4MHbv\n3m3myKxLrVbDzs4OOp3O1qGIEzODDz/8kI0fP565uroyX19flpSUxL766iuTl/vee++x6OhoptVq\nzRCl+MlkMnbhwgWLLX/58uVs9erVZlve+vXr2dKlS82yrL6ue3l5OZPJZLy+G/n5+czf39+U8Pod\nk7fYr732GlatWoV169bh6tWrqKiowMqVK3Ho0CGTf3QuXbqE0NBQ2NmJumNhVqyX84Xa29tNWval\nS5egVCoFvVer1ZrUNh+9rbvYOMtkkPGcPD09rR+gKb8K169fZ66uruzTTz81OM/t27fZk08+yYYP\nH86GDx/OMjIyWGtrK2Os45fUz8+Pbd++nQ0bNoz5+vqy/fv3M8YYe+GFF5ijoyOTy+XM1dWVvf32\n23dsIbr/au/fv58FBwczNzc3FhQUxD788EPu+SlTpnDvO3HiBIuKimLu7u5swoQJ7Ouvv+Zemzp1\nKnv++efZb3/7W+bm5sYSEhJYbW1tj+vWGf+2bdvY0KFDma+vLzt48CD7/PPP2ahRo5inpyfLzMzk\n5v/mm2/YxIkT2ZAhQ5ivry974oknmEajYYwxFhsby2QyGXNxcWGurq7swIED3PJffvll5uPjwx56\n6CG9rc/PP//MPD09WXFxMWOMsaqqKubt7c0KCwvviHXatGnM3t6eOTk5MTc3N3b+/Hl2/fp1lpqa\nyoYOHcpGjBjBXnrpJabT6bj/2eTJk9mqVauYl5cXe/755/WWl5ubq/f5REZGct+J5cuXM19fX+bn\n58fWrVvHfT7nz59n9957L3N3d2fe3t5s0aJFBte9O61Wy9asWcO8vb1ZcHAw2717t95n/84777Dw\n8HDm5ubGgoOD2VtvvcUYY6ypqYk5OTkxOzs75urqytzc3Fh1dXWvnwUfANhLPCcT00wQk1rMzc1l\nDg4OvXaHnn/+eTZp0iR27do1du3aNTZ58mTuS5Kfn88cHBzY+vXrWXt7Ozty5AhzdnZm169fZ4wx\ntmHDBpaamsota8OGDQYTu6mpiQ0ePJiVlZUxxhi7cuUKO3v2LGNMP7Hr6urYkCFD2AcffMC0Wi3L\nyspiHh4erL6+njHWkdghISHs/PnzrKWlhcXFxbG1a9f2uG6d8b/44ousvb2d7du3j3l5ebHFixez\npqYmdvbsWTZo0CCmVqsZY4x999137JtvvmFarZap1WoWHh7OXn/9dW553bujnctfu3Yt02g0rKWl\n5Y5u5b59+5hSqWTNzc0sISGBPfXUUwY/i7i4OPb2229zj1NTU9mcOXNYU1MTU6vVLDQ0lHt9//79\nzMHBge3evZtptVrW0tJyx/K6fz6MMTZnzhz26KOPsubmZnb16lUWHR3NJdmiRYvYli1bGGOMtba2\nshMnThhc9+7efPNNFhYWxiorK1l9fT2Li4tjdnZ23Hfv888/ZxcvXmSMMVZYWMicnZ25H7yCgoI7\nuuLGPgtjALCXeU62SGyT+rh1dXXw9vbutav80Ucf4YUXXoC3tze8vb2xfv16vP/++9zrcrkcL7zw\nAuzt7ZGUlARXV1f89NNPnb0Jve4ZM9JVs7Ozww8//ICWlhYoFIoeu52ff/457rnnHixZsgR2dnZY\ntGgRwsLCuF0HmUyGZcuWISQkBE5OTliwYAFKSkoMtimXy/Hcc8/B3t4eCxcuRH19PTIyMuDi4gKl\nUgmlUsm9/ze/+Q2io6NhZ2eHESNG4JFHHkFhYaHRddq4cSPkcjmcnJzueD09PR0hISGIjo5GTU0N\nNm/e3OvyOv+HWq0W2dnZyMzMhIuLC0aMGIE1a9bofTbDhw/HypUrYWdn12Pb3T+fmpoa5ObmYseO\nHRg0aBCGDh2KjIwMfPzxxwAAR0dHqNVqVFVVwdHREZMnT+411q4OHDiAVatWwc/PDx4eHnj22Wf1\n2r7vvvsQFBQEALj33nuRkJCAoqIivXXuSshn0Z0Dz8kWTEpsLy8v1NbW9joyefnyZYwYMYJ7fPfd\nd+Py5ct6y+j6w+Ds7IympqY+x+Li4oLs7Gz85S9/wfDhwzFr1izuB6J7PHfffbfecyNGjNCLycfH\nh/t70KBBvcbj5eUFmUzGzQsACoVC7/23bt0CAJSVlWHWrFnw9fWFu7s7nnvuOdTV1fW6XkOHDoWj\no2Ov86Snp+Ps2bP44x//CLlc3uu8nbHW1taira3tjs+mqqqKexwQENDrsrq7dOkS2tra4OvrCw8P\nD3h4eODRRx/FtWvXAADbtm0DYwzR0dEYM2YM9u/fz3vZ1dXVevF0/wxzc3MxceJEeHl5wcPDA0eO\nHOn1fyvks+huEM/JFkxK7EmTJuGuu+7CwYMHDc4zfPhwqNVq7vEvv/yC4cOHC2rP1dUVzc3N3OMr\nV67ovZ6QkICjR4/iypUrCAsLw8MPP3zHMvz8/HDp0iW95y5dugQ/Pz9BMfXFY489BqVSiZ9//hmN\njY3YvHmz0cM1nYloSFNTEzIyMpCeno7169ejoaGBVyze3t6Qy+V3fDb+/v682+7eUwsICMBdd92F\nuro6NDQ0oKGhAY2Njfjhhx8AdPzg7d27F1VVVXjrrbfw+OOP4+LFi7zi9fX1xS+//KIXa6fW1lbM\nmzcPf/7zn3H16lU0NDTgvvvu47bUPa2HkM+iOznPyRZMSmx3d3ds2rQJK1euRE5ODpqbm9HW1obc\n3Fw8/fTTAICUlBS89NJLqK2tRW1tLTZt2oTU1FRB7UVGRuL48eOoqKhAY2MjMjMzudeuXr2KnJwc\n3Lp1C3K5HC4uLrC3t79jGUlJSSgrK0NWVhba29uRnZ2Nc+fOYdasWdw8xrr8QjU1NcHNzQ3Ozs44\nd+4c3nzzTb3XFQoFLly40KdlPvnkk4iOjsbevXsxc+ZMPProo73O37lu9vb2WLBgAZ577jk0NTXh\n0qVL2LFjB5YuXcq7bYVCAbVazS3T19cXCQkJWL16NW7evAmdTocLFy7g+PHjAIBPPvkElZWVAIAh\nQ4ZAJpNxPw7G1n3BggXYtWsXqqqq0NDQgK1bt3KvaTQaaDQabrcwNzcXR48e1Yuzrq4ON27c4J4z\n9lnwIdmuOACsXr0ar732Gl566SUMGzYMd999N/bs2YO5c+cCANatW4eoqCiMHTsWY8eORVRUFNat\nW8e9v7etQufhgk4zZszAwoULMXbsWEyYMAHJycnc6zqdDjt27ICfnx+8vLxQVFTEfVhdl+Pl5YXD\nhw9j+/bt8Pb2xquvvorDhw/rHZLo2mb3GHqKsbfHXb366qv46KOPMHjwYDzyyCNYtGiR3vwbNmxA\nWloaPDw88Omnnxpsu/O5nJwcHD16lFvP1157DcXFxcjKyuIV7xtvvAEXFxcEBwcjNjYWS5YswbJl\ny3itNwA8+OCDADr+p1FRUQCA9957DxqNBkqlEp6ennjwwQe5ntWpU6cwceJEuLm54f7778euXbsQ\nGBjY47p39/DDDyMxMRHjxo1DVFQU5s2bx8Xn5uaGXbt2YcGCBfD09ERWVhbuv/9+7r1hYWFISUlB\ncHAwPD09ceXKFaOfBR9i3mLLmKU2T4RImEwmw8c8510E6x+jt1VPgZB+z1ZbYz5sdkpXXl4ewsLC\nMGrUKLz88stWaXP58uVQKBSIiIiwSnudKioqMG3aNIwePRpjxozBrl27LN7m7du3ERMTg8jISCiV\nSjzzzDMWb7MrrVYLlUqF5ORkq7UZGBiIsWPHQqVSITo62uLtCe2KG/o+LFy4ECqVCiqVCkFBQVCp\nVNx7MjMzMWrUKISFhemNHxhk9SPnjLH29nY2cuRIVl5ezjQaDRs3bhwrLS21eLvHjx9nxcXFbMyY\nMRZvq6vq6mp2+vRpxhhjN2/eZKGhoVZZ31u3bjHGGGtra2MxMTGsqKjI4m122r59O1u8eDFLTk62\nWpuBgYGsrq7OKm0BYMd4Tt3TjM/3Yc2aNezFF19kjDF29uxZNm7cOKbRaFh5eTkbOXKk0XPkbbLF\nPnnyJEJCQhAYGAi5XI5FixYhJyfH4u3GxsbCw8PD4u105+Pjg8jISAAdh+zCw8P1jptbirOzM4CO\nUWOtVmu1c5YrKytx5MgRpKenW33f0prtCR0VN/Z9YIzhwIEDSElJAdAxSJqSkgK5XI7AwECEhITg\n5MmTvcZmk8SuqqrSO9nA399f78QIKVOr1Th9+jRiYmIs3pZOp0NkZCQUCgWmTZsm+AKQvlq1ahVe\neeUVq1+8I5PJMGPGDERFRWHfvn0Wb88co+I9fR+KioqgUCgwcuRIAB0nVXU9v4BPvthk8KyvhxWk\noqmpCfPnz8fOnTvh6upq8fbs7OxQUlKCxsZGJCYmoqCgAHFxcRZt8/Dhwxg2bBhUKhUKCgos2lZ3\nJ06cgK+vL65du4b4+HiEhYUhNjbWYu0ZSp7iXydjDH0fsrKysHjx4l7fa/TkIR7tm52fnx8qKiq4\nxxUVFXq/SFLU1taGefPmYenSpZgzZ45V23Z3d8fMmTNx6tQpi7f19ddf49ChQwgKCkJKSgq+/PJL\nPPTQQxZvF+g4QQboOA137ty5RrurpjK0hY4B8FiXqSeGvg/t7e04ePCgXmGR7vlSWVlp/ExJcw0m\n9EVbWxsLDg5m5eXlrLW11WqDZ4x1XBFm7cEznU7HUlNTWUZGhtXavHbtGmtoaGCMMdbc3MxiY2PZ\nF198YbX2Geu4qmrWrFlWaevWrVvsxo0bjLGOSzUnT57M/vnPf1qsPQDsDM+pe5r19n3Izc1lcXFx\nes91Dp61trayixcvsuDgYO7yWkNs0hV3cHDA7t27kZiYCK1WixUrViA8PNzi7aakpKCwsBB1dXUI\nCAjApk2buDOtLOnEiRP44IMPuEMxQMfhi9///vcWa7O6uhppaWnQ6XTQ6XRITU3F9OnTLdaeIdba\n7aqpqeHOdmxvb8eSJUuQkJBg0TaFHsfu7fuQnZ3NDZp1UiqVWLBgAZRKJRwcHLBnzx6j/1c684wQ\nAWQyGcp5zhsEOvOMkH5DzGeeUWITIpCYk0fMsREianK+2WNaDUpBTE7sgXpMmkhTX/aFHaSc2ACw\nXuD7CgDECXzvRsGtmtryYBPa/ScAoTcuuGF8FoMKIHx9+1Obpra7sU9zy++s4yEa1BUnRCDeW2wb\nEHFohIib/C5bR2CYTRM7cMC1PNJG7QYOkDat3K6IN4uU2FYVYqN2AwdIm1ZulxKbEAkScfaIODRC\nRI5GxQmRIBFnj4hDI0TkaFScEAkScfYYraBiizLBhPQLIr7HT6+JrdVq8cQTTyAvLw+lpaXIysrC\njz/+aK3YCBE3e55TN8bqzG/fvh12dnaor6/nnutrXfFef0+6lgkGwJUJtka1E0JET+DWWC6XY8eO\nHYiMjERTUxPGjx+P+Ph4hIeHo6KiAseOHdO7vXFpaSmys7NRWlqKqqoqzJgxA2VlZb1Wge11iz2Q\nywQTYpTArnhvdcVXr16Nbdu26c0vpK54r785fC/JLOjydyBseUYZIX2h/nUSyED2FNR3TLwi6FJX\nPCcnB/7+/hg7dqzePJcvX8bEiRO5xybXFedbJjiOzxoQIjqB0N8MFfbt7QYOd8X5dkydNhq47XfX\nuuJ2dnbYsmULjh07xr3e27XhJtUVj4qKwvnz56FWq6HRaJCdnY3Zs2f3ukBCBgwTRsW71xW/cOEC\n1Go1xo0bh6CgIFRWVmL8+PGoqakRVFe818TuWiZYqVRi4cKFNHBGSCeBo+KMMaxYsQJKpRIZGRkA\ngIiICNTU1KC8vBzl5eXw9/dHcXExFAoFZs+ejY8//hgajQbl5eU4f/680buJGh3XS0pKQlJSUp/W\nl5ABQeCoeE91xbds2aKXZ1272japKy6TyUwqUiSUaaWRTGFKaSRTmFIaifCzkXfNM5lMBpbKb6my\n96muOCH9B13dRYgEiTh7RBwaISLnZOsADKPEJkQo6ooTIkEizh4Rh0aIyIk4e0QcGiEiR11xQiRI\nxNljltBscbIIe7Zv91kyF9lr1j3RgHN7g23aJYZJPbEJGZComCEhEiTi7BFxaISInIizR8ShESJy\nNCpOiASJOHuM1hUnhBggsIKKofLDn3zyCUaPHg17e3sUFxfrvces5YcJIb0Q2BU3VH44IiICBw8e\nxB/+8Ae9+YWUH6bEJkQogVd3+fj4wMfHB4B++eHp06f3OL+h8sNdK5d2R11xQoQywy1+upYfNuTy\n5ct61YFNLj9MCOmFga54wX+BgrPG3961/LCrq2ufmjZW84wSmxChDGRPXGTH1Glj9p3zdC8/3Buz\nlx8GgOXLl0OhUCAiIsLYrIQMLAK74j2VH+5pnk5Cyg8bTexly5YhLy/P2GyEDDwC64p3lh/Oz8+H\nSqWCSqVCbm4uPvvsMwQEBOA///kPZs6cyZUj7lp+OCkpyXzlh9VqNZKTk/HDDz/cuQCZDKCruyyP\nru6ygj6WH+Z5RyDZVCo/TEj/If1TSgu6/B0Iut8m6R/UsMTdNsXATKHFmWcxhFhVIEy626b0E5uQ\nAUjE2WN0VDwlJQWTJ09GWVkZAgICsH//fmvERYj4CRwVtwajvzlZWVnWiIOQ/kfEW2wRh0aIyFHN\nM0IkSMTZI+LQCBE5EWePiEMjROREnD0iDo0QcWPSP/OMkIFHK+LsEXFohIgbJTYhEtR6lyPPOTUW\njaMnVPOMEIG09va8pu56Kl5y8uRJREdHQ6VSYcKECfj222+51/paehjgeT12rwuw0fXYtmkTYKNs\n81soO2+b9R1Y+nY99jXGr07ZUFmT3nKLiorg6uqKhx56iKtxEBcXh2eeeQaJiYnIzc3Ftm3bkJ+f\nj9LSUixevBjffvst79LDAG2xCRGsHfa8pu5iY2Ph4eGh95yvry8aGxsBANevX+dqmhkqPWwM7WMT\nIpDWjOmzdetWTJkyBX/605+g0+nw73//G0BH6eGu9cP5lB4GKLEJEUxr4NKtfxdo8J+Ctj4ta8WK\nFdi1axfmzp2LTz75BMuXL8exY8d6nNdYvTOAEpsQwQwldnTcIETHDeIev77xltFlnTx5El988QUA\nYP78+UhPTwcgrPQwQPvYhAjWCkdeEx8hISEoLOyo4PLll18iNDQUgLDSwwBtsQkRTOg+dkpKCgoL\nC1FbW4uAgABs2rQJe/fuxcqVK9Ha2opBgwZh7969APRLDzs4OPAqPQzQ4a4+o8NdUta3w11nWCiv\necfJyqj8MCH9haF9bDGgxCZEoJ6OUYsFJTYhApnzOLa5iTcyQkROzF1xoyNBFRUVmDZtGkaPHo0x\nY8Zg165d1oiLENHTwJHXZAtGt9hyuRw7duxAZGQkmpqaMH78eMTHxyM8PNwa8REiWv16H9vHxwc+\nPj4AAFdXV4SHh+Py5cuU2GTAk8w+tlqtxunTpxETE9PtlYIufweCbspH+gc1TLkpn5j3sXkndlNT\nE+bPn4+dO3fC1bX7dahx5o2KEKsIhCk35ev3id3W1oZ58+Zh6dKlmDNnjqVjIqRf6Nf72IwxrFix\nAkqlEhkZGdaIiZB+QSPie/wYPdx14sQJfPDBB8jPz4dKpYJKpUJeXp41YiNE1LSw5zXZgtEt9pQp\nU6DT6awRCyH9Sr/uihNCeibmw11UaIEQgYR2xXsqP7xhwwb4+/tzu7u5ubnca0LKD1NiEyKQ0MRe\ntmzZHeNUMpkMq1evxunTp3H69GkkJSUBAEpLS5GdnY3S0lLk5eXh8ccf57VrTIlNiEBCE7un8sMA\neizGILT8MCU2IQK14i5eE19vvPEGxo0bhxUrVuD69esAOsoP+/v7c/PwLT9MiU2IQIa20GUF1cjb\ncIqb+HjsscdQXl6OkpIS+Pr6Ys2aNQbnpfLDhFiQoWPUgXEjEBg3gnt8bKPxrvOwYcO4v9PT05Gc\nnAyAyg8TYnVCb/HTk+rqau7vgwcPciPmVH6YECszV/nhjRs3oqCgACUlJZDJZAgKCsJbb70FYECW\nHx5Y2OsbbdKuLGOrTdoFPG3Q5iN9Kj/8PHuW17wvyrZQ+WFC+ot+f9kmIeROfG/fYwuU2IQIJOZz\nxcUbGSEiR11xQiSIEpsQCaLrsQmRINrHJkSCqCtOiATZ6vY9fFBiEyJQv97Hvn37NqZOnYrW1lZo\nNBrcf//9yMzMtEZshIhav97HdnJyQn5+PpydndHe3o4pU6bgq6++wpQpU6wRHyGi1e/3sZ2dnQEA\nGo0GWq0Wnp62OEGfEHERc2Lzuh5bp9MhMjISCoUC06ZNg1KptHRchIieOa/HNjdeiW1nZ4eSkhJU\nVlbi+PHjKCgo6DZHQZdJbcbwCLGknwD8X5epb7Rw4DV111P54aeeegrh4eEYN24cHnjgATQ2NnKv\nWbz8sLu7O2bOnIlTp7rXcYrrMgX2ZZGE2NA9AJK7TH2jgSOvqbueyg8nJCTg7NmzOHPmDEJDQ7kB\naouVH66treUqJra0tODYsWNQqVS8VpwQKRPaFe+p/HB8fDzs7DrSMSYmBpWVlQCElx82OnhWXV2N\ntLQ06HQ66HQ6pKamYvr06bxWnBAps9ThrnfeeQcpKSkAOsoPT5w4kXuNb/lho5FFRESguLjYhDAJ\nkSZDo+I3Ck7jZkGJoGVu3rwZjo6OWLx4scF5qPwwIRZkKLFd4qLgEhfFPa7a+Ddey/vb3/6GI0eO\n4F//+hf3HJUfJsTKzHl/7Ly8PLzyyivIycmBk5MT9zyVHybEyvpy+56ueio/nJmZCY1Gg/j4eADA\npEmTsGfPHio/LHVUftga+lZ+OJSd4TVvmWwclR8mpL8Q8ymllNiECNSvL9skhPSsX1+2SQjpGXXF\nCZEgSmxJUdikVVmGbY48aK6vtUm7jkP43TDello1VPOMEMnRtos3fcQbGSEip22nrjghkkOJTYgE\ntbdRYhMiOTqteNNHvJERInbUFSdEgm6LN33EGxkhYtdu6wAMo0ILhAjVznPqwc6dOxEREYExY8Zg\n586dAID6+nrEx8cjNDQUCQkJXBFRISixCRFKYGL/97//xV//+ld8++23OHPmDA4fPowLFy5g69at\niI+PR1lZGaZPn46tW4VfC0+JTYhQbTynbs6dO4eYmBg4OTnB3t4eU6dOxd///nccOnQIaWlpAIC0\ntDR89tlngkPjldharRYqlQrJyX0vqk6IZGl5Tt2MGTMGRUVFqK+vR3NzM44cOYLKykrU1NRAoei4\nFkGhUKCmpkZwaLwGz3bu3AmlUombN28KbogQyTE0eHa6ACgpMPi2sLAwPP3000hISICLiwsiIyNh\nb69/6Ewmk/GqbWaI0S12ZWUljhw5gvT0dKvXbSJE1G4bmMLjgJQN/5t6sHz5cpw6dQqFhYXw8PBA\naGgoFAoFrly5AqDjRh3Dhg0THJrRxF61ahVeeeUV7vYjhJBfmTAqfvXqVQDAL7/8gn/84x9YvHgx\nZs+ejXfffRcA8O6772LOnDmCQ+u1K3748GEMGzYMKpWqhztsdtX1tUDQjflI//Ddr5NAJhzHnj9/\nPurq6iCXy7Fnzx64u7tj7dq1WLBgAd5++20EBgbiwIEDgpffa/nhZ599Fu+//z4cHBxw+/Zt3Lhx\nA/PmzcN77733vwUMuPLDtim0AAgfSDGF5rptyh7bptBCVJ/KD+PvPHdN58msvhvba/96y5YtqKio\nQHl5OT7++GP87ne/00tqQgY0gYe7rKFPp5SaMkpHiOT0cChLLHgn9tSpUzF16lRLxkJI/yLic8Xp\nIhBChLpt6wAMo8QmRCjaYhMiQZTYhEgQJTYhEmSjQ1l8UGITIpQUDncRQrqhUXFCJIj2sQmRINrH\nlhLbXIxhK45D9tqkXVYVZfU2ZX59fAPtYxMiQdQVJ0SCRJzYVBaFEKFMuGzz+vXrmD9/PsLDw6FU\nKvHNN99QXXFCRKGV59SDJ598Evfddx9+/PFHfP/99wgLCzNrXfFeK6jwWsCAq6Ay0PR1RMk8WNUj\nVm9T5oe+VVBJ4Zk6WfoVVBobG6FSqXDx4kW92cLCwlBYWMgVNYyLi8O5c+d4x98V7WMTIpShw13X\nCjomA8rLyzF06FAsW7YMZ86cwfjx4/H666+bta44dcUJEcrQDQI844B7Nvxv6qa9vR3FxcV4/PHH\nUVxcDBcXlzu63RavK04IMUBg+WF/f3/4+/tjwoQJADoqlhYXF8PHx8d6dcUJIQYITGwfHx8EBASg\nrKwMAPDFF19g9OjRSE5Otk5dcUJIL0w4pfSNN97AkiVLoNFoMHLkSOzfvx9ardY6dcV5LYBGxSWO\nRsV7nFcmA2J5pk6R9euK89piBwYGYvDgwbC3t4dcLsfJkyctHRch4ifiM894JbZMJkNBQQE8PT0t\nHQ8h/YcUru6iO20S0o2Ir+7iNSouk8kwY8YMREVFYd++fZaOiZD+wYS7bVoary32iRMn4Ovri2vX\nriE+Ph5hYWGIjY3tMkdBl78DQXfbJP1BwddAwb9NWICI97H7PCq+ceNGuLq6Ys2aNR0LoFFxiaNR\n8R7nlcmAEJ6p87PI7rYJAM3Nzbh58yYA4NatWzh69CgiIiIsHhghomfC1V2WZrQrXlNTg7lz5wLo\nOMd1yZIlSEhIsHhghIieiLviRhM7KCgIJSUl1oiFkP5FCoe7CCHdiPhwFyU2IUL15644IcQASmxC\nJIj2sQmRIBFvsW1caEFN7Uq23Z9s0GbH2WRid/v2bcTExCAyMhJKpRLPPPMMAEip/LCa2pVsu2U2\naNPEU0Q1iuMqAAACUklEQVStxMnJCfn5+SgpKcH333+P/Px8fPXVV2YtP0ylkQixAWdnZwCARqOB\nVquFh4cHDh06hLS0NABAWloaPvvsM8HLp8QmRDDhtwLR6XSIjIyEQqHAtGnTMHr0aLOWHzZTaSRC\npKFPF4Gg2cCrxwEUdXm82eByGxsbkZiYiMzMTDzwwANoaGjgXvP09ER9fT2veLozeVScCjCQgcvQ\n8a5Jv06dNhtcgru7O2bOnInvvvuOuwOIj48PlR8mxHZaeE76amtruRHvlpYWHDt2DCqVCrNnz6by\nw4TYnrAzVKqrq5GWlgadTgedTofU1FRMnz4dKpVKPOWHCRmIOvaxy3nOHSTO8sOEkJ6I95xSSmxC\nBBPvOaWU2IQIRltsQiTozhFvsaDEJkQw6ooTIkHUFSdEgmiLTYgE0RabEAmiLTYhEkRbbEIkiA53\nESJBtMUmRIJoH5sQCRLvFpsKLRAiWDvP6U55eXkICwvDqFGj8PLLL5s9MkpsQgQTVsxQq9XiiSee\nQF5eHkpLS5GVlYUff/zRrJFRYhMimLAt9smTJxESEoLAwEDI5XIsWrQIOTk5Zo2MEpsQwYTVPKuq\nqkJAQAD32N/fH1VVVWaNjAbPCBFsA6+5XF1d9R5bo2Q3JTYhAphSw8zPzw8VFRXc44qKCvj7+5sj\nLA51xQmxsqioKJw/fx5qtRoajQbZ2dmYPXu2WdugLTYhVubg4IDdu3cjMTERWq0WK1asQHh4uFnb\noPLDhEgQdcUJkSBKbEIkiBKbEAmixCZEgiixCZEgSmxCJOj/ATRSWg/4/CHgAAAAAElFTkSuQmCC\n"
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "import pylab as pl\n",
    "pl.matshow(test_cm)\n",
    "pl.title('Confusion matrix for test data')\n",
    "pl.colorbar()\n",
    "pl.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As before, we now compute some commonly used measures of prediction \"goodness\". "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 117,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Accuracy = 0.919192\n",
      "Precision = 0.921428\n",
      "Recall = 0.919192\n",
      "F1 score = 0.919609\n"
     ]
    }
   ],
   "source": [
    "# Accuracy\n",
    "print(\"Accuracy = %f\" %(skm.accuracy_score(test_truth,test_pred)))\n",
    "# Precision\n",
    "print(\"Precision = %f\" %(skm.precision_score(test_truth,test_pred)))\n",
    "# Recall\n",
    "print(\"Recall = %f\" %(skm.recall_score(test_truth,test_pred)))\n",
    "# F1 Score\n",
    "print(\"F1 score = %f\" %(skm.f1_score(test_truth,test_pred)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As we somewhat expected this model is more accurate.  \n",
    "BUT ...  \n",
    "we have no idea what the variables are and we gain no knowledeg or intuition from this model.\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "HOWEVER ...\n",
    "We did save a lot of tedious work wrangling text columns.\n",
    "Perhaps we could go bacj and try to understand these features by mapping back to the original dataset and extracting these columns.\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 126,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "origindx = [int(x[1:]) for x in top10]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 127,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[41, 49, 50, 52, 53, 56, 57, 344, 558, 559]"
      ]
     },
     "execution_count": 127,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "origindx\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 128,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index([x41, x49, x50, x52, x53, x56, x57, x344, x558, x559], dtype=object)"
      ]
     },
     "execution_count": 128,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "samsungdata.columns[origindx]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 129,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "samorig = pd.read_csv('../datasets/samsung/samsungdata.csv')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 130,
   "metadata": {
    "collapsed": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "Index([tGravityAcc-mean()-X, tGravityAcc-mad()-Z, tGravityAcc-max()-X, tGravityAcc-max()-Z, tGravityAcc-min()-X, tGravityAcc-sma(), tGravityAcc-energy()-X, fBodyAcc-bandsEnergy()-25,48.2, angle(tBodyGyroJerkMean,gravityMean), angle(X,gravityMean)], dtype=object)"
      ]
     },
     "execution_count": 130,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "samorig.columns[origindx]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "So these are the columns names in the raw data set. But still we have no immediate intuition about the project."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 2",
   "language": "python",
   "name": "python2"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "2.7.8"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}
